fix: gate prompt caching by provider, not model name only #22
Open
sumleo wants to merge 1 commit into
Author
Hi @ResearAI, gentle nudge on this when you have a moment. It's a small, self-contained prompt-caching fix, and I'm happy to rebase or tweak anything if that would make review easier. Thanks for the project and your time!
What
Make prompt caching provider-aware so it actually works for the documented `anthropic` and `openrouter` configurations, not just Bedrock.

Why
Prompt caching is currently wired up for Bedrock only, even though the README documents `AI_PROVIDER=anthropic` and `AI_PROVIDER=openrouter` as supported setups. Two issues combine so that caching silently no-ops outside Bedrock:

- `frontend/app/api/chat/route.ts` (cache breakpoints, previously ~L317/L337/L347): the cache markers were hardcoded to the Bedrock namespace `providerOptions.bedrock.cachePoint` and applied whenever `supportsPromptCaching(modelId)` returned true. With `@ai-sdk/anthropic`, the adapter expects `providerOptions.anthropic.cacheControl = { type: "ephemeral" }`; the `bedrock` namespace is just ignored, so no cache breakpoints are ever sent and nothing gets cached.
- `frontend/lib/ai-providers.ts` (`supportsPromptCaching`, previously ~L683): gating was on the model-name substring only (`claude`/`anthropic`), with no provider dimension. So a Claude model on a provider that can't honor a Bedrock cache marker still reported caching as enabled.

Fix
- `supportsPromptCaching(modelId, provider?)` now takes the resolved provider and only enables caching for providers whose AI SDK adapter understands a Claude cache marker (`bedrock`, `anthropic`, `openrouter`). The `provider` argument is optional and falls back to the old model-name-only behaviour, so existing callers keep working.
- `getAIModel` now returns the resolved `provider` (it was already computed internally), so `route.ts` can pass it through.
- `getCacheBreakpointProviderOptions(provider)` emits the correct marker shape per provider:
  - `bedrock` → `{ bedrock: { cachePoint: { type: "default" } } }` (unchanged)
  - `anthropic`/`openrouter` (and other Claude-capable adapters) → `{ anthropic: { cacheControl: { type: "ephemeral" } } }`
- `route.ts` uses that helper for all three cache breakpoints (last assistant message + the two system breakpoints) instead of the hardcoded Bedrock object.

The OpenRouter adapter reads `anthropic.cacheControl` from `providerOptions` and maps it to its own `cache_control` field, so the Anthropic marker shape covers both `anthropic` and `openrouter`.

Testing
The repo has no test runner (Biome only). `npx biome ci` on the two changed files is clean. Note: `npm run check` reports pre-existing errors/warnings across other files that are already present on `main` (identical counts before and after this change); this PR introduces none.

Related
Open PR #20 ("custom provider protocol support") touches the Python `autofigure/*` code and some autofigure frontend components, but not `app/api/chat/route.ts` or `lib/ai-providers.ts`, so there's no overlap with this fix.
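
Illustrative sketch

For reviewers who want the shape of the change without opening the diff, the gating and marker selection described under Fix could look roughly like the sketch below. It is not the code in this PR: the function names and marker shapes come from the description above, but the bodies, the `CACHE_CAPABLE_PROVIDERS` set, the `isClaudeModel` helper, and the sample model IDs are assumptions made for illustration.

```typescript
// Hypothetical sketch only; the real helpers live in frontend/lib/ai-providers.ts
// and may be implemented differently.

// Assumed list of providers whose adapter understands a Claude cache marker.
const CACHE_CAPABLE_PROVIDERS = new Set(["bedrock", "anthropic", "openrouter"]);

// Assumed helper: the old model-name substring check (claude / anthropic).
function isClaudeModel(modelId: string): boolean {
  const id = modelId.toLowerCase();
  return id.includes("claude") || id.includes("anthropic");
}

// Caching is enabled only for Claude models on a cache-capable provider.
// When no provider is passed, fall back to the model-name-only behaviour.
function supportsPromptCaching(modelId: string, provider?: string): boolean {
  if (!isClaudeModel(modelId)) return false;
  if (provider === undefined) return true;
  return CACHE_CAPABLE_PROVIDERS.has(provider);
}

// Pick the marker shape for the resolved provider.
// Bedrock keeps its own namespace; everything else gets the Anthropic shape.
function getCacheBreakpointProviderOptions(provider: string) {
  if (provider === "bedrock") {
    return { bedrock: { cachePoint: { type: "default" } } };
  }
  return { anthropic: { cacheControl: { type: "ephemeral" } } };
}
```

In `route.ts`, each of the three cache breakpoints would then attach the result of `getCacheBreakpointProviderOptions(provider)` as the message's `providerOptions`, guarded by `supportsPromptCaching(modelId, provider)`.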